-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SI-9047: Extract attributes from multiple elements #43
SI-9047: Extract attributes from multiple elements #43
Conversation
Allow querying for attributes when a NodeSeq has more than one element.
@@ -96,8 +96,7 @@ abstract class NodeSeq extends AbstractSeq[Node] with immutable.Seq[Node] with S | |||
def \(that: String): NodeSeq = { | |||
def fail = throw new IllegalArgumentException(that) | |||
def atResult = { | |||
lazy val y = this(0) | |||
val attr = | |||
this flatMap (y => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of x map (y => { ... })
you can x map { y => ... }
.
A comment on the original issue says that all issues are now tracked in this project; is there any more detail I can add for this to be merged? |
Hey @josephw , thank you for the submission of the pull request and patiently waiting for long time. Please give me some time to go through the request and get back to you. |
This looks good to me for merge. Please let us know if the community has any concerns and questions regarding this PR. |
Wouldn't this change likely break existing code? |
@ashawley Yes, it's a behavioural change, which could affect code that's relying on getting the empty list in the case of multiple elements. Does that seem like a likely case? We hit this when we were surprised by it not working when multiple elements were present so, in the absence of intentional tests to show the current behaviour is intentional, I think it's a fixable bug. (If the behavioural change is unacceptable, it would still be valuable to drop that part and change this into a test for the current behaviour.) |
I agree with @josephw and we should go with this fix rather than leaving the current behaviour. This current erroneous behaviour is surprising and shouldn't be default. Let us know if others have any thoughts about this. |
@josephw Yes, although the intentions here are good, the examples given in SI-9047 and carried over in to the unit tests show a slight misunderstanding of how the projection operator currently works in scala-xml. This is understandable because of how scala-xml's permissive type hierarchy is structured: The sequence types live alongside scalar values. This isn't scala-xml's fault really, since it's trying to acquiesce to XML. The methods are written in ways that are not always predictable for users. Here is projection for one child node:
Here is projection for two child nodes:
The above results seem predictable and reasonable. Things go a little awry when you use projection on different sized NodeSeqs.
Apparently, what you are supposed to do, is iterate on both results.
It's not clear what the motivation of this was, if it was intentional. I'm going to add some tests in a separate pull request. |
Thanks for the details.
That's exactly the change we made; it struck me as a workaround that should be unnecessary. So I can see how they work, but I wasn't convinced it was intentional or correct.
Agreed; I believe this version is less surprising. |
I think this example, for me, shows the current behaviour might not be intended one. Look at the difference between quieries x \ "a" \ "b" and x \ "a" \ "@b" . If the former query is supposed to work on nodeseq with working on each individuals again, then why not the same should be applied to the later ? Also
Should returning empty node seq for this case is fine ? Please let me know, if there are any wrong assumption and/or analysis I have made. |
The assumption is that an XML node ( Unfortunately, an attribute value is retrievable with this method from a sequence of XML nodes of length one. It has been for quite some time. This seems unintentional. I assume it is an accident of the implementation. It would not be caught by the compiler: The projection method is available from all types (including It seems like there are two choices available here:
|
Now lets see what is happening for operator "\" when working with attribute, for extra fun.
I assume at best we can say there is inconsistency between "" and "\" , both being used for projection. |
Yes, indeed. I've added tests on descendant attributes to issue 78. Thank you. As noted in a comment on scala/scala#1679, the |
Was going through the history of scala(library\xml) to see when this behaviour( extract attribute if node seq has single node) was introduced and it was around 08, introduced by @burakemir . I am still not able to dig any discussion around those. In essence this behaviour is there from a long time (from 08). |
Please let me know any reason community thinks about whether to take this patch or not. I am in two minds at the moment between the not institutive behaviour vs. long used behaviour. |
What would you merge? As it stands, this pull request has conflicts. After that, it would not pass a Travis build. It's hard to review something that doesn't work. Has anyone studied what the consequences of this change to scala-xml will be in either their own or in other projects that depend on scala-xml? I haven't. That would be good to know to make a decision. In the interest of clarity and getting more visibility on the consequences of this change, perhaps a new pull request should be started? |
…cting-attributes-from-multiple-elements
The larger point is, if this is a issue at all to be fixed or not. Patch is minimal, in the sense that reproducing it is not big deal. As for consequence, How we figure it out who is using the above mentioned feature, I am pretty unsure how we can go to estimated the number of users of the number of users for this particular feature. Let me see at least we can come up with hypothetical use cases and the effect of the changes to them. |
Hey all. It's been a very long time that I haven't worked with scala-xml, so I am not sure how useful this will be... Let me start by stating the things I can remember clearly from that time:
"""An important characteristic of the data model is that there is no distinction between an item (a node or an atomic value) and a singleton sequence containing that item. An item is equivalent to a singleton sequence containing that item and vice versa.""" This (let us call it the NodeSeq principle) leads to many lines of conflict with the rest of Scala's expressive static type system.
Now for the issue at hand, it does not really matter what I thought when I implemented "@" at the time. There were a few places where I had chosen to be inconsistent on purpose. I think, this might have been my way of easing the discrepancy between elements and objects, or it may have simplified some common use cases, and there were certainly some inconsistencies that I would not introduce again today (equality on NodeSeq comes to mind)... but then many people changed the library after me, too. Now that we got that out of the way, I do like the consistency argument that @ should behave the same for sequences of any length. This would be a natural consequence of the NodeSeq principle. Speaking without any authority, since I am not involved in scala-xml maintenance in any way: a change isn't necessarily bad just because it breaks some code, this is simply a matter of policy and choice of the maintainers. A "no breaking changes ever" policy seems overly restrictive. I'd think introducing breaking changes should be fine (after due discussion on mailing list and whatever decision-finding process the current maintainers are following), as long as you bump the major revision number ("semantic versioning"). If you don't have any releases, then it should be at least preceded by an announcement on the community mailing list. Finally, at the time, I had tried hard to keep documentation up-to-date. This change sounds like it would make some documentation obsolete as well. It seems consistency in a library like this will only really appreciated if users can perceive it easily by looking at code examples. Hope this helps! |
@burakemir Just very nice of you to take the time to chime in! Also glad that the NodeSeq principle now has a name. And that it is a principle. |
Thanks for the additional context! We all agreed it was awkward. Glad to have it blessed by @burakemir as a bug. Indeed, the fix will require some analysis of what nature the breaking change is, and deciding about whether it is a major revision number bump, requires announcements, changing documentation and so on. Maven reports that there are 286 uses of this library! http://mvnrepository.com/artifact/org.scala-lang.modules/scala-xml_2.11/usages This only counts the latest versions, and I'm sure there are many hundreds more unpublished projects with it as a dependency as well. Thanks again! |
I vote for a major version bump for this. Path of least cost for everyone involved, AFAICT. |
…cting-attributes-from-multiple-elements
Ensure that Group is returned, rather than NodeSeq, and adjust the newer test now that multiple attributes are successfully returned.
It's a shame to leave this unresolved. I've taken another look.
I've resolved the merge conflict. I've also made a couple of fixes necessitated by the tests added in 72b11f6 (good -- they caught bugs), as well as intentionally adjusting one of those tests to match the newer behaviour. |
@ashawley is this eligible to progress in 2.0, or should we just close it? |
closing for inactivity — but at least based on skimming the discussion, it seems like someone could revive it |
Ensure that
\ "@attr"
works as expected when theNodeSeq
has more than a single element in it. Make the existing implementation more general and remove the one-element special case.